Date: 05/22/2019
Data source
Developer Information
This is second part of a two-part series on accessing the International Debt Statistics (IDS) database through the World Bank Data API. In the first part (LINK NEEDED), we queried the World Bank Data API to retrieve indicator names and location codes. In this guide, we will use that information to explore the regional trends of long-term external debt stocks from the IDS database. The following code will show step-by-step how to:
To start, make sure you have the following packages installed on your machine. To install an R package type install.packages(“WDI”) into R. You can also visit each of the linked sites below for reference.
Then, open up your preferred mode of writing R. This could be in R Studio. Now follow the rest of the steps below to retreive and analyze the World Bank data.
# Load packages
library(WDI)
library(ggplot2)
library(plotly)
Now you can specify the the data that you want to explore using the following parameters:
The World Bank’s Data Catalog contains indicator codes for each data series. In this guide, we will be looking at “long-term external debt stock” from the IDS data. To find the indicator for the data in which you’re interested, you can either explore the World Bank data catalog or use an API query (LINK NEEDED) as outlined in Part 1 of this series. The IDS indicators are also conveniently stored as a spreadsheet (LINK NEEDED) in this repo.
# selecting the indicator
dataSeries = "DT.DOD.DLXF.CD"
To select a location by country, region, or income level you will need to know its 2 or 3 letter code. To figure out what this code is, you can either find it on the WITS System site, use an API query as outlined in Part 1 of this series (LINK NEEDED), or use the convenient location-codes spreadsheet* in this repo.
We will select regional aggregates (excluding high-income countries):
# Select the countries or regions
location = c("ECA","SSA","SAS","LAC","MNA","EAP")
Here you will select the time frame for the data you are retrieving, by picking the first and last year of the time frame.
# Selecting the time frame
firstYear = 2008
lastYear = 2017
In this step, we will retrieve the data using the World Bank API call. The package “WDI” can request information from the World Bank database … MORE HERE
data = WDI(indicator=dataSeries, country=location, start=firstYear, end=lastYear)
If you want a quick preview of your freshly retrieved DataFrame, you can print just the first 5 lines
head(data)
## iso2c country DT.DOD.DLXF.CD year
## 1 4E East Asia & Pacific (excluding high income) 1.263986e+12 2017
## 2 4E East Asia & Pacific (excluding high income) 1.158415e+12 2016
## 3 4E East Asia & Pacific (excluding high income) 1.035285e+12 2015
## 4 4E East Asia & Pacific (excluding high income) 1.029315e+12 2014
## 5 4E East Asia & Pacific (excluding high income) 8.397770e+11 2013
## 6 4E East Asia & Pacific (excluding high income) 7.688467e+11 2012
Congratulations! At this point you should have the long-term external debt stock for regions (excluding high-income economies) from 2008 - 2017 all in a DataFrame called “data.”
Now we can do:
As you saw in the preview of the data in section 3, the data needs some basic cleaning.
The data for the long-term external debt stock is currently in units. To improve a table’s or chart’s readability, I will convert the units to billions and change the visible decimal places to zero.
# change units to billions
data$DT.DOD.DLXF.CD = data$DT.DOD.DLXF.CD/1000000000
# hide decimal places
data$DT.DOD.DLXF.CD = round(data$DT.DOD.DLXF.CD, 0)
These next sections of code will clean up the naming of headers and data. First, it will rename the column headers. Second, it will remove the redundant “(excluding high income)” from the region names. We can instead include that information in the title of the legend.Finally, it will remove the “iso2c” column.
# rename column headers
colnames(data)[colnames(data)=="country"] <- "Region"
colnames(data)[colnames(data)=="DT.DOD.DLXF.CD"] <- "ExternalDebtStocks"
colnames(data)[colnames(data)=="year"] <- "Year"
# remove "(excluding high income)" from each of the region names
data$Region <- gsub("excluding high income", "", data$Region)
# removes the parenthesis
data$Region <- gsub("\\()", "", data$Region)
# remove the iso2c column
data <- subset(data, select = -c(iso2c))
Now our data should be ready to present in a table or visualize. Let’s take a look at the first five lines again so we can compare the cleaned up data to the output in section 3.
head(data)
## Region ExternalDebtStocks Year
## 1 East Asia & Pacific 1264 2017
## 2 East Asia & Pacific 1158 2016
## 3 East Asia & Pacific 1035 2015
## 4 East Asia & Pacific 1029 2014
## 5 East Asia & Pacific 840 2013
## 6 East Asia & Pacific 769 2012
I will use the package “Plotly” to create a line graph, similar to one from the blog post on the launch of IDS 2019.
p<-plot_ly(
data, x = ~Year, y=~ExternalDebtStocks,
type="scatter", mode = "lines",
hoverinfo = "text",
text=~paste("External Debt Stock: $",ExternalDebtStocks, "B", "<br>Region: ",Region, "<br>Year: ",Year),
color=~Region)%>%
layout(
title= "Long-term External Debt Stocks ($ billion)"
)
p